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We address the following problem: given two smooth densities on a manifold, 
find an optimal diffeomorphism that transforms one density into the other. Our 
framework builds on connections between the Fisher-Rao information metric 
on the space of probability densities and right-invariant metrics on the infinite¬ 
dimensional manifold of diffeomorphisms. This optimal information transport, 
and modifications thereof, allows us to construct numerical algorithms for den¬ 
sity matching. The algorithms are inherently more efficient than those based 
on optimal mass transport or diffeomorphic registration. Our methods have 
applications in medical image registration, texture mapping, image morphing, 
non-uniform random sampling, and mesh adaptivity. Some of these applications 
are illustrated in examples. 

Keywords: density matching, information geometry, Fisher-Rao metric, op¬ 
timal transport, image registration, diffeomorphism groups, random sampling 
MSC2010: 58E50, 49Q10, 58E10 

Contents 


1. Introduction 2 

2. Geometric foundation 5 

2.1. Notation. 5 

2.2. Space of densities and the Eisher-Rao metric. 6 

2.3. Group of diffeomorphisms and the information metric. 8 

2.4. Moser’s principal bundle structure. 9 


* Supported by FWF-Project “Geometry of shape spaces and related infinite dimensional spaces” (P246251) 
^ Supported by the Swedish Research Council and the Swedish Foundation for Strategic Research. 


1 







2.5. Riemannian submersions and descending metrics. 10 

2.6. Base manifold with infinite volume. 11 

3. Matching with compatible background metric 11 

3.1. Horizontal lifting of paths of densities. 12 

3.2. Exact compatible density matching (optimal information transport) . 14 

3.3. Inexact compatible density matching. 16 

4. Examples of matching with compatible metric 16 

4.1. Random sampling from non-uniform probability distributions. 16 

4.2. Registration of letters: J to V. 18 

5. Matching with non-compatible background metric 19 

5.1. Inexact matching with the divergence-metric. 19 

5.2. Gradient flow for inexact matching. 21 

5.3. Geometry of the gradient flow. 24 

5.4. Two-component gradient flow. 26 

5.5. Relation to matching with compatible background metric. 26 

6. Examples of matching with non-compatible metric 27 

6.1. Registration of letters: J to V. 27 

6.2. Registration of noisy x-ray images . 29 

A. Constructing compatible background metrics 30 

A.l. Gonformal metric. 30 

A.2. Gonstructing a flat compatible metric . 31 

A.3. Symmetric matching. 32 

B. Relation between Fisher-Rao and other distances 32 


1. Introduction 

In this paper we study the problem of finding diffeomorphic (bijective and smooth) trans¬ 
formations between two densities (volume forms) on an n-manifold M equipped with a 
Riemannian metric g and volume form vol. This has applications in many image analysis 
problems and is an extension of the classical image registration problem. Specific appli¬ 
cations of density matching include: medical image registration [17, 35, 37, 16]; texture 
mapping in computer graphics [10, 39]; image morphing techniques [40, 31]; random sam¬ 
pling in Bayesian inference [27, 32]; mesh adaptivity in computational methods for PDEs [9]. 
A more extensive list of applications and algorithms is given in [30]. 

The difference between classical image registration (cf. [38]) and density matching is the 
way transformations act. In image registration, transformations act on positive scalar func¬ 
tions (images) by composition from the right. In density matching, transformations act by 
pullback of volume forms: if the density is represented by a function I: M ^ IR+, the action 
is given by 

\D(fi\I oip, 

where ip\ M ^ M is the transformation and \D(p\ is its Jacobian determinant. 
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When studying geometric aspects of density matching, it is convenient to use the frame¬ 
work of exterior calculus of differential forms. A density is then thought of as a volume 
form, and the action is given by pullback: 

H> if*II. 

The volume form /i is related to / by // = /vol. I is the Radon-Nikodym derivative of /i 
with respect to vol. For the convenience, we use both the function and the exterior calculus 
point-of-view throughout the paper; the relation between functions and volume forms is 
always understood to be /i = /vol. 

Let Diff(M) and Dens(M) respectively denote diffeomorphisms and normalized, smooth 
densities on M. Both Diff(M) and Dens(M) are infinite-dimensional manifolds (see § 2 
for details). Let G denote a Riemannian metric on Diff(M) with corresponding distance 
function ^ 

inf f G^(t) (7(f), 7(f)) df. 

^{0)=Lp,^{l)='lp Jq 

Likewise, let G denote a Riemannian metric on Dens(M) with distance function d. We 
are interested in special cases of the following two, generally formulated density matching 
problems: 

Problem 1 (Exact density matching). Given ^ Dens(M), find cp G Diff(M) mini¬ 

mizing d(id, (p)‘^ under the constraint 

III = (p*llo ■= {(p~^)*iio- 

Equivalently, using intensity functions Iq and /i, the constraint is 

Ii = \D^-^\{Ioocp-^). 


Problem 2 (Inexact density matching). Given G Dens(M), find cp G Diff(M) mini¬ 

mizing 

E{ip) := a(f{id,ip)+(P{ip^:iio,lii)‘^, cr > 0 . (3) 

The first term in (3) is a regularity measure: it penalizes irregularity of (p. The second term 
is a similarity measure: it penalizes dissimilarity between and pi. The parameter a is 
balancing the two criteria. 

There is no intrinsic choice of G and G; they are free to be specified and evaluated in the 
specific application. The following choices are, however, typically considered. 

(i) Eor Problem 1, the standard choice is distance-squared optimal mass transport (OMT), 
corresponding to the non-invariant metric 

G^{U,V)= [ g(G,E)/io, G,EGT^Diff(M), 

Jm 

= / g{u,v)p^po, u:=U op-^^v :=V op~^ . 

JM 

This choice of metric induces the Wasserstein distance on Dens(M) [29, 36]. 
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(ii) For Problem 2, a common choice [34, 19] is the right-invariant metric 

G^{U,V)= [ g{{l-aAfu,v)Yo\, 

JM 

where A is the Laplace-de Rham operator lifted to vector fields (see § 2 for details), 
and, as similarity measure, the norm 

= IImo - Mi|Il2 := [ |/o-^i|^vol. 

JM 

This setting is similar to Large Deformation Diffeomorphie Metrie Matehing (LD- 
DMM) [20, 24, 34], but with the density action (1) instead of the composition action. 

Both choices (i) and (ii) are computationally challenging, as they require the numerical 
solution of nonlinear partial differential equations (the Monge-Ampere and EPDiff equations 
respectively). See [30] and [38] for efficient and stable implementations. 

In this paper we consider metrics for Problem 1 and Problem 2 that reduce the computa¬ 
tional challenge to solving Poisson equations, allowing significantly faster, semi-explicit algo¬ 
rithms. Our approach is based on connections between information geometry and geodesic 
equations on diffeomorphisms (cf. topologieal hydrodynamies [2, 22]), now outlined. 

Khesin, Lenells, Misiolek, and Preston [21] found that the degenerate divergenee-metrie 
on Diff(M), given by 

G‘^"'{U,V)= [ div(M) div(v)vol, (4) 

JM 

induces a non-degenerate metric on the quotient space Diffvoi(A/’)\Diff(M) of right co-sets. 
Here, Diffvoi(Af) C Diff(M) denotes volume-preserving diffeomorphisms 

(f G DiSyoi{M) (/?*vol = vol. 


From Moser’s principal bundle structure Diffvoi(Af)\Diff(M) Dens(M) (see §2.4 for de¬ 
tails) it follows that the divergence-metric (4) induces a metric on Dens(M). Remarkably, 
this is the infinite-dimensional version of the Fisher-Rao metrie (also called Fisher’s infor¬ 
mation metrie)^ predominant in information geometry. It is given by 


Gf,Xa,P) = jf --fi, 

^ JA 


M L L 


( 5 ) 


for tangent vectors P G T^Dens(M). It can be interpreted as the Hessian of relative 
entropy, or information divergence. 

Due to the degeneracy, the geometry of the divergence-metric is not ideal: instead one 
would like a non-degenerate metric on Diff(M) where the projection from Diff(M) to 
Dens(M) is a Riemannian submersion (see § 2.5 for details). As remarked in [21], none 
of the standard right-invariant, Sobolev-type metrics on Diff(M) have this property. Nev¬ 
ertheless, it is possible to construct them: a three-parameter family was introduced in [25], 
where also a complete characterization of such metrics is given. Here we focus on a specific 
member of the family, namely 


k 

Gl{U,V)= [ g( Am, t))vol + A V f g{u,^i)vo\ [ g(v,Ci)vol, 

JM JM JM 


(6) 
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where A > 0, A is the Hodge-Laplacian lifted to vector fields, and ..., is an orthonormal 
basis of the harmonic vector fields on M (see § 2.1 for details). We call the information 
metric, as it is descends to the Fisher-Rao information metric as explained in § 2.5. 

The descending property of (6) implies horizontal geodesics. Such horizontal geodesics 
describe optimal transformations between probability densities, which leads to optimal in¬ 
formation transport (OIT)—an information theoretic analogue of OMT. The analogue to the 
Wasserstein distance in OMT is the spherical Hellinger distance induced by the Fisher-Rao 
metric. The analogue to Brenier’s polar factorization [7], which solves the OMT problem, 
is the information factorization of diffeomorphisms [25, § 5], which solves the OIT prob¬ 
lem. Because of the descending property, Fisher-Rao geodesics on Dens(M) can be lifted to 
horizontal geodesics on Diff(M). This observation is the main ingredient of our algorithms. 

We consider Problem 1 and Problem 2 in two different settings, leading to different al¬ 
gorithms. The first contribution, developed in § 3, considers a background metric that is 
compatible in the sense that /io = vol. This has applications in texture mapping, random 
sampling, and mesh adaptivity. Numerical examples are given in § 4. The second contri¬ 
bution, developed in § 5, makes no compatibility assumption but uses a slightly modified 
optimality condition to retain efficiency. This has applications in image morphing and image 
registration. Numerical examples are given in § 6. The computational complexity of both 
algorithms is significantly lower than those based on OMT or LDDMM. 

The emphasis of the paper is on mathematical foundations rather than applications. The 
examples in § 4 and § 6 are therefore kept simple and illustrative. 

2. Geometric foundation 

2.1. Notation 

Throughout the paper, the word “metric” always means “Riemannian metric” and “dis¬ 
tance” always means “Riemannian distance”. 

Let M be an n-dimensional orient able manifold with metric g. We refer to g as the back¬ 
ground metric, to distinguish it from metrics on infinite-dimensional spaces. The background 
metric g induces a volume form on M, denoted volg. In oriented coordinates x^,... ,x^, the 
expression for volg is 

volg = visl A • • • A dx'^, 

where |g| denotes the determinant of the metric tensor. When the background metric g is 
clear from the context, we write vol instead of volg. 

The total volume of M with respect to vol, for now assumed to be finite, is denoted 

vol(M) := f vol. 

Jm 

The space of smooth, real valued function on M is denoted C^{M). The space of smooth 
vector fields and /c-form on M are denoted X(M) and Qf{M) respectively. The background 
metric g induces the musical isomorphism b: X(M) ^ 0^(M) by := g{u, •). Its inverse is 
denoted so u = If a G and /i is a volume form on M, then ^ is the function 

on M defined by 
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Notice that this is used in the definition of the Fisher-Rao metric (5). If u G X(M), we 
sometimes use the notation u • v instead of g{u, v). 

A volume form /i on M induces a divergence, defined via 

CufJ- = div^{u)n, 

where jCu denotes the Lie derivative in direction of the vector field u G X(M). When /i = vol, 
we write div{u) instead of diVvoi('^)- 

The gradient of a function f on M with respect to g is defined by 

gradg/ = (d/)**, 

where d: is the natural differential of forms. Again, if g is clear from 

the context, we simply write grad/. 

Recall the Laplace operator A defined by A/ = divgrad/. We also use A to de¬ 
note the lifted Laplace-de Rham operator on the space of vector fields, defined by Au = 
— {Sdu^ + dSu^)^^ where S: is the codifferential operator (contrary to 

the differential d, the codifferential 5 depends on the background metric g). We sometimes 
denote the Laplacian by Ag when the dependence on g needs to be stressed. The space of 
harmonic vector fields is given by 

io(M) = {eGX(M);Ag/ = 0}. 

If M is a closed manifold (compact and without boundary), then S){M) is finite-dimensional. 
If (M, g) is flat, then Ag on vector fields is the standard Laplacian on functions applied 
elementwise. 

2.2. Space of densities and the Fisher-Rao metric 

The space of densities Dens(M) consists of smooth volume forms with total volume vol(M): 

Dens(M) = {/r G [ ja = vol(M), /i > 0}. 

Jm 

We like to think of Dens(M) as an infinite-dimensional manifold. To make this rigorous, 
first observe that if M is compact, then the space of top-forms O’^(M) is a Frechet space 
with the topology induced by the Sobolev seminorms (see Hamilton [18, Example 1.1.5] for 
details). Let c G M. Then the set = {a e fl’^(M); a = c} is a closed, affine 

subspace of and Dens(M) is an open subset of Therefore, Dens(M) 

is a Frechet manifold (cf. [18, Chapter 4]). The closure of Dens(M) is a Frechet manifold 
with boundary, given by 


Dens(M) = {a e > 0}. 

Since Dens(M) is an open subset of a closed, afhne subspace of a vector space, its tangent 
space at /r is given by 

T^Dens(M) =n^{M). 

Notice that T^Dens(M) is independent of /i, so the tangent bundle is trivial TDens(M) 
Dens(M) x 
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As an alternative to the Frechet topology discussed here, one might work with the com¬ 
pletion of Dens(M) in the Sobolev topology for a differentiability index s. This space, 
denoted Dens^(M), then becomes a Banach manifold (see [25] for details). The results in 
this paper are valid in both the Frechet and the Banach category. 

In the case when M is not compact, an infinite-dimensional manifold structure can still 
be given, as discussed in § 2.6. 

Recall the Fisher-Rao metric on Dens(M) is given by equation (5). This metric 
is weak in the Frechet (or Banach) topology. Nevertheless, its geodesics are well-posed. 
Indeed, the astonishing property of the Fisher-Rao metric is that its geodesics are explicit. 
Following [14], we introduce the IF-map 


VF: Dens(M) ^ C^(M), 



( 7 ) 


The infinite-dimensional sphere = {/ G C^(M); p vol = vol(M)} is a subman¬ 

ifold of C^{M) and the image of IF is a open subset of S'^{M) [21, Theorem 3.2]. Indeed, 
if /i G Dens(M), then 


ol(M)=/ [ —,yol=[ WiiifYol=:\\Wi^i)\\ 

JM J M Jm 


Let a G T^Dens(M) and let p := T^W • a = Then 



M = Gl{a,a). 


We have thus showed that IF is an isometry between Dens(M) with the Fisher-Rao metric 
and an open subset of S^{M). Since the geodesics of the infinite-dimensional sphere S^{M) 
are explicitly known, we obtain the geodesics on Dens(M). Indeed, the Fisher-Rao geodesic 
between jiQ and pi is given by 


[ 0 , 1 ] 3 11 —y 


sin ((1 — t)0) 
sinO 


fo 


sin (tO) 
sin^ 



2 

vol. 


0 = arccos 


f {fo, / i ) l 2 a 

\ vol(M) J 


( 8 ) 


where fo = W{jio) and fi = IF(/ii). 

A direct consequence of the formulae (8) is an explicit formula for the induced geodesic 
distance. Indeed, if dp denotes the distance function of the Fisher-Rao metric, then 


dF{l^o,hi) = \/vol(M) arccos 



(9) 


As already mentioned, this formula for dpipo^Pi) is the key ingredient for our density 
matching algorithms. It is a spherical version of the Hellinger distance [21]. In Appendix B 
we compare the geodesic distance of the Fisher-Rao metric to other commonly used distance 
functions on the space of probability measures. 

Remark 2.1. Recall that the Fisher-Rao metric on Dens(M) is canonical: it does not 
depend on the choice of background metric. For the IF-map in equation (7), this implies 
that vol can be any volume form, as long as p is absolutely continuous with respect to it. 
In particular, as in Example 3.6 below, it does not have to be the volume form associated 
with the background metric. 
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Remark 2.2. In information geometry [1], a finite-dimensional submanifold F C Dens(M) 
is called a statistical manifold. The Fisher-Rao metric on Dens(M) induces a metric on F; 
in local coordinates 0 = ... ,0^) G it is given by 


G] 




d\np{x,0) dinp{x,0) 


d9i 




p(x, 0)vol{x), 




M ^^3 

d \np{x, 0) d \np{x, 6) 


dOi 


d0i 


I 

Jm 


d\/p{x,0) d^p{x, 0) 


d0i 


d0i 


vol(x) 


where 9 j9(-, ^)vol G Dens(M) is the local coordinate chart. The tensor field G^ (^) is the 

classical information matrix of Fisher [13]. 


2.3. Group of diffeomorphisms and the information metric 

The set of diffeomorphisms on M is denoted Diff(M); it consists of smooth bijective map¬ 
pings M ^ M with smooth inverses. This set has a natural group structure, by composition 
of maps. If M is compact, then Diff(M) is a Frechet Lie group [18, §F4.6], i.e., a Frechet 
manifold where the group operations are smooth mappings. The Lie algebra of Diff(M) is 
given by the space X(M) of smooth vector fields (tangential if M has a boundary). There 
is a natural choice of an inner product on X(M), given by 

{u,v)l2 := / g('u,i;)vol. (10) 

JM 

The tangent space T^Diff(M) consists of maps U: M ^ TM with U o (p~^ e X{M). 

As with the space of densities, one can also chose to work in the Sobolev completion 
Diff^(M). For large enough s, the set Diff^(M) is a Banach manifold. It is, however, not 
a Banach Lie group, because left composition is not smooth, only continuous—an issue to 
be carefully addressed when deriving rigorous existence results for geodesics equations on 
Diff^(M) (see for example [12, 25]). The case of non-compact M is discussed in §2.6. 

Recall that we are interested in the information metric on Diff(M), defined in equa¬ 
tion (6). Again, this metric is weak in the Frechet (or Banach) topology. Nevertheless, 
the geodesics are well-posed: local existence and uniqueness of the geodesic equation on 
Diff^(M) with the Banach topology is given in [25, § 3]; the result is extended to Diff(M) 
with the Frechet topology by standard techniques as in [12]. 

The metric G^ has the property of right-invariance: if U^V G T^Diff(M) then 

V) = ° V’V ° ^), G Diff(M). 

This property implies that the geodesic equation can be stated in terms of the reduced 
variable u = ip o ip~^ g X(M), given by 

d 

—m + CuTn + m div u = 0, m? = Au^ 
at 
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where A: X(M) X(M) is the inertial operator associated with the inner product •) 

on X(M). Explicitly, 

k p 

Au = Au + X'^Ci g{u,^i)vo\, ( 11 ) 

i=l 

where is an basis for the space of harmonic vector fields orthogonal with 

respect to the inner product (10). Since is a non-degenerate metric, the inertia 
operator A is invertible. Let us now compute its inverse. 

First, it follows from the Hodge decomposition of 1-forms that the space of vector fields 
admits the L^-orthogonal decomposition X(M) = €(M) 0io(M), where €(M) is the image 
of the Laplace-de Rham operator A: X(M) ^ X(M). The inertia operator is diagonal with 
respect to this decomposition: if u = w -j- ^ are the components then 

A(w + 0 = ^ ^ 

Since A is a automorphism on ^(M), the inverse A“^: €(M) ^ €(M) is well-defined, so 

A-i(u; + C) = A-i«;+le 

A 

To compute the components w and ^ of u ^ X(M), it suffices to first compute 

k 

i=l 

and then set w = u — We have thus computed the inverse A ~^: X(M) —> X(M). 

2.4. Moser’s principal bundle structure 

Recall from § 1 that the diffeomorphism group Diff(M) acts from the right on the space of 
densities Dens(M) via equation (1). This action is not free: the isotropy subgroup Diff^(M) 
of /i G Dens(M) (also called stabilizer of ja) consists those diffeomorphisms that are volume 
preserving with respect to given by 

Diff^(M) := {if G Diff(M); = /i}. 

The action is, however, transitive, proved by Moser [28] for Diff(M) and Dens(M) and by 
Ebin and Marsden [12] for Diff^(M) and Dens^~^(M); for any pair of densities /r there 
exists a diffeomorphism ip such that ip^'v = ji. If we fix a reference density p G Dens(M), 
we can therefore identify Dens(M) with the quotient space of right co-sets 

[ip] := Diff^(M) oipe Diff^(M)\Diff(M). 

Indeed, if -0 G [ip], then = ip*fi, so the map 

Diff^(M)\Diff(M) 3 [ip] ^ ipy G Dens(M) 

is well-defined. It is injective, by construction, and surjective, by Moser’s transitivity result. 
The projection map 


TT^ : Diff(M) 3ip^ipye Dens(M) 
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thereby provides a principal bundle structure 


Diff^(M)C—^Diff(M) (12) 

Dens(M). 

That is, the preimage 7r“^(A) of each A G Dens(M) is a fibre in Diff(M), and each fibre 
is parameterized by the left action of Diff^(M) on Diff(M), see Hamilton [18, §111.2.5] for 
details. When /i = vol, we write tt instead of tTvoI- 

Remark 2.3. In §2.2 we mapped Dens(M) to the a subset of S^{M) via the W-mapping 
in equation (7). The reason for this map was the simple interpretation of the Fisher-Rao 
metric in this representation (as the sphere metric). Through this map, Diff(M) also acts 
on Indeed, for / G 5'^(M) and ip G Diff(M) the action is given by 

f~k(p:= P\Dip\{f oif). 

As required, IT((^*/i) = W{/j.) 'kip. 

2.5. Riemannian submersions and descending metrics 

In this section we show how that the information metric and the Fisher-Rao metric 
gives a Riemannian structure to the principal bundle (12). This Riemannian structure is 
the key to our density matching algorithms in § 3 and § 5. 

Moser’s result on transitivity implies that tt: Diff(M) -G Dens(M) is a submersion: it is 
smooth and its derivative is surjective at every point. Following the work in [25, § 4], we 
now show that tt is, in fact, a Riemannian submersion with respect to G^ and G^. 

Let C Tc^Diff(M) denote the vertieal distribution given by the tangent spaces of the 
fibres of the principal bundle structure (12): 

U eV^ ^ T^tt • = 0. 

The horizontal distribution C T^Diff(M) with respect to the information metric G^ is 
given by 

U ^ GliU,V) = 0, vy e 

In other words, Hip is the orthogonal complement of V^. The metric descends to 
through the principle bundle structure (12). That is 

G^PU, y) = Gp■U,T^TTpi-V), \fU,VeH^. 

A remarkable property of descending metrics is that initially horizontal geodesics remain 
horizontal: if ip{t) is a geodesic curve and (/9(0) G then ip{t) G R(p{t) for all t. Fur¬ 

thermore, if ip{t) G Diff(M) is a horizontal geodesic curve, then fi{t) := 7r{ip{t)) is a geodesic 
curve on Dens(M). 

In summary, we have the following result. 

Lemma 2.4 ([25]). Under the identifieation Dens(M) Diffvoi(Af)\Diff(M), the infor¬ 
mation metrie Gfi given by (6), deseends to the Fisher-Rao metrie G^, given by (5), i.e., 
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tt; (Diff(M),G^) ^ (Dens(M),( 5 ^) is a Riemannian submersion. The horizontal distribu¬ 
tion is right-invariant, given by 

n^ = {Ue r^Diff(M); [/ O <^-1 = grad(/), / G C°°{M)} . 


Remark 2.5. In [4] it was shown that the Fisher-Rao metric is the unique metric on 
Dens(M) that is invariant under the action of the diffeomorphism group. As a consequence, 
any right-invariant metric on Diff(M) that descends to a metric on Dens(M) through the 
principal bundle structure ( 12 ) descends to the Fisher-Rao metric. 

Remark 2.6. The condition for a right-invariant metric to descend to right cosets is 
transversal to the condition for a subgroup to be totally geodesic. See [26] for details. 


2.6. Base manifold with infinite volume 

In the setting so far we assumed that M is compact and that vol(M) is finite. In some 
applications it is useful to drop this assumption and consider non-compact manifolds with 
infinite volume, in particular M = MT. By imposing decay conditions on the set of densi¬ 
ties and diffeomorphisms, the previously described theory continues to hold, as now briefly 
explained. For a detailed exposition in the case of the diffeomorphism group on M see [3]. 

Let vol be the volume form of a non-compact Riemannian manifold M. We introduce 
compactly supported densities, diffeomorphisms, and functions via 


DenSc(M) := {/ vol G Dens(M); {x;I{x) 7 ^ 1} has compact closure} 
Diffc(M) := {(f e Diff(M); {ip; ip{x) 7 ^ x} has compact closure} 
C^{M) := {/ G f has compact support} . 


With these decay conditions, the theory described in §2.2-§2.5 for the compact case extends 
to the non-compact, infinite volume case. In particular, Moser’s Lemma is still valid. 

Then the space of compactly supported densities is an open subset of a sphere with infinite 
radius, thus a fiat space. To see this, we slightly modify the W-mapping (7) to 


W : Dens(M) ^ C^{M), 



Using this mapping, the formula (8) for geodesics on DenSc(M) simplifies to 


[0, l]^t^ W-^ ((1 - t)W(Mo) + tW(Mi)) 


and the induced geodesic distance is given by the Hellinger distance. 


3 . Matching with compatibie background metric 

In this section we derive efficient algorithms to solve Problem 1 and Problem 2 with respect 
to the information metric ( 6 ) on Diff(M) and the Fisher-Rao metric (5) on Dens(M) and a 
background metric g on M fulfilling the compatibility constraints volg = /io- This property 
is fulfilled in some applications of density matching, for example texture mapping, random 
sampling, and mesh adaptivity. 
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An integral component of our method is the ability to horizontally lift paths in Dens(M) 
to the diffeomorphism group. Indeed, the selection of on Diff(M) and on Dens(M) 
fulfils two central properties, (i) Fisher-Rao geodesics on Dens(M) are explicit [14], and (ii) 
the metrics (6) and (5) are related via a principal bundle structure [25]. It is these properties 
that allow us to construct fast, explicit algorithms. 


3.1. Horizontal lifting of paths of densities 

Given a path of densities /i(t) G Dens(M) we want to find a path of diffeomorphisms ip{t) 
that project onto /i(t) with respect to Moser’s principal bundle (12), i.e., 

7’'^(0)(¥’W) = V’WXO) = (13) 

Such a path is not unique, since we can compose any solution (f from the left with any dif¬ 
feomorphisms ^ G Diff^('o)(M). To address the non-uniqueness, we therefore consider paths 
of diffeomorphisms fulfilling (13) while of minimal length with respect to the information 
metric G^, given by (6). Mathematically, this problem is formulated as follows: 

Problem 3 (Horizontal lifting). Given a path of densities fi G G^([0,1],Dens(M)), find a 
path of diffeomorphisms G G^([0, l],Diff(M)) fulfilling 

p{0) = id, 

p{tyfi{0) = fi{t) , (14) 


while minimizing 





In general there is no easy way to solve this problem. If, however, the background metric 
fulfils the following compatibility condition, then Problem 3 reduces to solving Poisson 
equations. 

Definition 3.1 (Compatible background metric). Let /i G Dens(M). The background 
metric g on M is called compatible with p if volg = fi. 

The following result explains the advantage of having a background metric compatible 
with /i(0). 

Lemma 3.2. Let the background metric g on M be compatible with /i(0). Then there is a 
unique path of diffeomorphisms solving Problem 3. This path is horizontal with respect to 
the information metric (6). 

Proof To prove this statement we differentiate the equation (14) for ja{t): 

ii(t) = a((</?(i)X0)) = (15) 

Here v{t) G X{M) denotes the right trivialized derivative of <p, i.e., tp o = v. Using the 
compatibility condition, we get 

fi{t) = (/?(t)*£„(t)Volg = div(v(t)) o p{t) p{t). (16) 
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By the Hodge-Helmholz decomposition for vector fields, we can write v as 


V = grad / + re , 


for some function / and a divergence free part w. Equation (15) only determines the diver¬ 
gence part of the vector field which allows us to choose w freely. Since the Hodge-Helmholz 
decomposition is orthogonal with respect to the information metric the length of cp is 
minimal for re = 0. This is the horizontality condition, described in § 2.5. That the solu¬ 
tion is unique follows from uniqueness of the information factorization of diffeomorphisms, 
see [25, §5]. □ 

From Lemma 3.2 we obtain an equation for the solution of the lifting problem. 

Theorem 3.3. Under the same eonditions as in Lemma 3.2, the unique solution of Prob¬ 
lem 3 is obtained by solving the Poisson equation 


v{t) = grad(/(e), 

ip{t) = v{t) oip{t), (^(0) = id. 


(17) 


Proof The horizontal bundle is generated by gradient vector fields v 
function /. Thus, 

div{v{t)) = div(grad(/)) = A/. 

From equation (16) we then obtain (17). 


grad/ for some 


□ 


Remark 3.4. We can use the equivariance of the Laplacian and the gradient to rewrite the 
differential equation (17) as 


A^(t).,(/(t)o^(t)) = ^, 


ip{t)*v{t) = grady.g(/(t) o 

ip(t) = v{t)oi^(t), </?(0) = id. 


If we introduce h{t) := /(t) o ip{t),w{t) := (p{tyv{t) and g{t) := (p{tyg the above equations 
become 


wit)) = gr&dg^phit)), 

(fit) =Tid(p~^-wit), (^(0 )=id. 


The main difference to equation (17) is the time-dependence of the Laplacian in Poisson’s 
equation. 

A numerical algorithm for solving Problem 3 is now given as follows. 
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Algorithm 1. Assume we have a numerical way to represent functions, vector fields, and 
diffeomorphisms on M, and numerical methods for (i) composing functions and vector fields 
with diffeomorphisms, (ii) computing the gradient of functions, and (iii) computing the 
inverse of the Laplace operator A. Given a C^-path of densities [0, 1] 3 t /i(t) with 
/i( 0 ) = vol, a numerical algorithm for computing discrete lifted paths {(pk}k=o 
is given as follows: 

1. Chose a step-size 5 = 1/N for some positive integer N. Initialize to = 0, cpo = id, and 

= id. Set k ^ 0. 

2 . Compute ° solve the Poisson equation 

Afk = 4 . 

3. Compute the gradient vector field = grad//.. 

4. Construct approximations ipk to ex.p{evk) and to exp(—^u/.), for example 

V^/c = id + svk, = id - evk. 

5. Update the diffeomorphisms 

‘Pk+i = i’k °‘fik, <^fc+i = 

6 . Set k ^ k 1 and continue from step 2 unless k = N. 

3.2. Exact compatible density matching (optimal information transport) 

The special case of Problem 1 with and for infinite-dimensional metrics and a com¬ 
patible background metric gives OIT. 

Problem 4 (Optimal information transport). Given /ii G Dens(M), find ip G Diff(M) 
minimizing di{id^(p) under the constraint 


/ii = (^*vol. 


(18a) 


Equivalently, using the density function Ii for /ri, the constraint is 

Ji = 


To better conform to the horizontal lifting setup in § 3.1, which uses the pullback rather 
than pushforward action, we notice that if (/:? is a solution to Problem 4, then ip~^ is a solution 
to the same problem but with pullback in (18a) instead of pushforward. This follows from 
right-invariance of G^, as di{id^ip) = di{id^ (p~^). 

The following result is a direct consequence of the information factorization of diffeomor¬ 
phisms [25, Theorem 5.6]. 

Theorem 3.5. Problem 4 has a unique solution. Its inverse is given by the end-point of the 
solution to the lifting equations (17) for the path /i(t) given by the Fisher-Rao geodesie (8) 
with /io = vol. 
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It follows that a numerical algorithm for Problem 4 is given by Algorithm 1 with /i(t) as 
in Theorem 3.5. This algorithm is demonstrated in § 4 below. Before that, we solve the 
lifting equations explicitly in the one-dimensional case. 

Example 3.6. We want to explicitly solve the Problem 4 in dimension one. Let /io = /o dx, 
/ii = /i dx be two arbitrary densities on M = 5'^ M/Z. Since M is one-dimensional we 
could solve this problem directly, using that, up to translations, the diffeomorphism ip is 
determined by the matching constraint only. We shall, however, refrain from using this fact 
and instead solve the lifting equations (17). 

The standard metric on is not compatible with po unless /o = 1. Nevertheless, it is 
straightforward to construct a compatible background metric: choose g = Iq dx0dx. Then 
volg = po as required. In contrast to the higher-dimensional case, this choice of a compatible 
background metric is uniquely determined by Jq; two different ways to construct compatible 
metrics in the higher-dimensional case are described in Appendix A. 

Using Theorem 3.5 we are now able to obtain the solution. To simplify the notation, let 
/o = ^/^o and fi = v/Ti? corresponding to the W-map W{p) = pfdx. First we recall the 
formula (8) for the Fisher-Rao geodesics on Dens(5'^), i.e.. 



where the angle 0 is given by 



To calculate the lifting equations we need a formula for the gradient and Laplacian of the 
metric g. For any function / we have 


grad (/) = Y^cof, ^gf = 44(/o4/) 
■'0 Jo 


These formulas are derived in a more general setting in § A.l. To simplify the notation we 



ji(t) 20 {cos{0 — tO)fo{x) — cos{tO)fi{x)) 
li{t) fi{x) sin(t6>) + sin(6> - tO)fo{x) 


The lifting equations (17) now becomes 



Jo 


The solution to this PDF is given by 



Evaluating at t = 1 we obtain 
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3.3. Inexact compatible density matching 

We are now interested in the special case of Problem 2 with and for infinite¬ 
dimensional metrics and a compatible background metric. 

Problem 5 (Inexact, compatible density matching). Given /ii G Dens(M), find cp G 
Diff(M) minimizing 

E{ip) = adj{id, ip) + d^p{ip^Yo\, /ii). 
where a > 0 is a fixed parameter. 

As in Problem 4, (/? is a minimizer of E{ip) if and only if 0 = ip~^ is a minimizer of 

E{(j)) = crd^(id, (j)) + d|.(0*vol, pi). (19) 

Our approach for Problem 5 is to minimize (19) and then obtain the solution by taking the 
inverse. From Lemma 2.4, i.e., that G^ descends to G^, it follows that the lifting equations 
can be used also to obtain the solution to Problem 5. 

Theorem 3.7. Problem 5 has a unique solution, obtained as follows. Let ip{t) be the hor¬ 
izontally lifted geodesie sueh that ip{l)~^ solves Problem 4 for the same pi. Let s = 

Then the solution is given by ip{s)~^. 

Proof A minimizer (f of (19) must be connected to the identity by a horizontal geodesic 
(otherwise it would be possible to find another diffeomorphism with a strictly smaller value 
of E, using [25, Theorem 5.6]). Therefore, minimizing E is equivalent to minimizing the 
functional e(/i) := crJ|.(vol,/i) + /ii) on Dens(M). First, notice that e is convex on 

Dens(M), so a unique minimizer exists. Denote it u. From the spherical geometry of 
(Dens(M), G^), explained in § 2.3, it is cleat that n must belong to the geodesic curve /a{t) 
between vol and pi. Without loss of generality, we may assume that the distance between 
vol and pi is 1. Using arclength parametrization /i(s), we then get e(/i(s)) = as^ + (1 ~ <5)^- 
Minimization over s now proves the result. □ 

From Theorem 3.7 it follows that a numerical algorithm for Problem 5 is obtained through 
Algorithm 1 by solving the lifting equations until reaching t = l/(cr + 1) and then take the 
inverse. 

4. Examples of matching with compatible metric 

In this section give some examples of matching with a compatible metric, using Algorithm 1. 

4.1. Random sampling from non-uniform probabiiity distributions 

In M, a classical algorithm for generating random samples from an arbitrary probability 
density function is to use the result in Example 3.6. That is, one uses the cumulative 
distribution function to transform the standard uniform random variable on the unit interval. 
Algorithm 1 can analogously be used to transform uniform random samples to samples from 
an arbitrary probability density on M. 
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As an example, let M = (the two-dimensional flat toms). We want to produce random 
samples from an arbitrary probability distribution on T^, for example 


/ii = (l — 0.8 cos(x) cos(2^)) dx A dy 

vol 

where x^y e [—tt, tt) are coordinates. The approach is to first use the lifting algorithm for 
Problem 4 to compute cp G Diff(T^) such that (/:?*vol = /ii, then draw random samples {xi^yi) 
from the uniform distribution (using a uniform random number generator), then map these 
samples into the /ii-distribution by {xi^yi) = (p{xi^yi). 

For a 1024 x 1024 grid on T^, with step-size 5 = 0.05 (a total of 20 steps), we obtain the 
following warp (f and Jacobian \Dip\. 


Warp 



Jacobian \D(p\ 



Green and pink shades of the Jacobian implies, respectively, expansion and contraction. The 
optimal information framework assures the warp is matching the two probability densities 
while locally minimizing metric distortion. 

We now draw 10^ uniform samples and transform them with the computed ip. 


Uniform samples Non-uniform samples 
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A benefit of transport-based methods over traditional Markov Chain Monte Carlo methods 
is cheap computation of additional samples; it amount to drawing uniform samples and then 
evaluating the transformation. A downside is poor scaling with increasing dimensions. 

Detailed comparisons of the OIT approach, developed here, to the OMT approach, devel¬ 
oped by Moselhy and Marzouk [27] and Reich [32], is beyond the scope of this paper. Since 
OIT is intrinsically simpler than OMT (Poisson instead of Monge-Ampere equation), we 
expect OIT to outperform OMT based approaches. 

4.2. Registration of ietters: J to V 

This example illustrates and explains why OIT, i.e.. Problem 4, is not suitable for image 
registration. Recall that the algorithm developed for Problem 4 works for any background 
metric. Thus, given a source density /io ^ Dens(M), we can construct a background metric 
g such that volg = /io (various ways of doing this are explained in Appendix A). Suppose 
now we want to match the letters J and V, represented as gray-scale functions Iq and Ii 
on M = T^. We might have to add some background density for black pixels, since Iq and 
Ii must be strictly positive in order for Algorithm 1 to be well-defined. Then we construct 
the conformal background metric g such that volg = /io- With step-size e = 0.05 (20 steps) 
and background density 0.2 (lowest grey-scale value, white corresponds to 1.0), we get the 
following sequence of warps. 


J J J vvv 


Source Target 


This is simply blending between the images, as foreseen by the formula (8) for Fisher-Rao 
geodesics. The corresponding mesh deformation and Jacobian of the inverse at the final 
point look as follows. 


Inverse warp 



Inverse Jacobian \D(f 



This is not a satisfactory registration: instead of transporting the white pixels of the J to 
the white pixels of the V, the resulting diffeomorphism produces white pixels by compressing 
the background pixels. This example shows that, although Problem 4 allows matching of 
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any pair of densities by using a compatible metric, only those applications where /tq is the 
standard volume form are likely to be of interest. A remedy for more general, non-compatible 
matching applications is developed next. 

5. Matching with non-compatibie background metric 

In § 3 we derived an algorithm for solving Problem 1 and Problem 2 for the case when the 
background metric g is compatible with /hq. In this section we want to derive an algorithm 
for the situation of a non-compatible background metric. When the background metric g 
is non-compatible, the solution to Problem 2 with respect to the information metric ( 6 ) on 
Diff(M) and Fisher-Rao metric (5) on Dens(M) is still obtained by a geodesic curve 
However, ip{t) is not horizontal and therefore does not project to a geodesic on the space 
of densities. As a consequence, the main ingredient of our efficient lifting algorithm—the 
explicit formula for geodesics on Dens(M)—cannot be used. From a geometric standpoint, 
the problem is that the information metric does not descend to Diff^Q(M)\Diff(M) 
unless fiQ = vol. 

To numerically solve the density matching problem using the LDDMM techniques devel¬ 
oped in [ 6 ] is plausible, but computationally expensive. In the following, we shall instead 
study a slightly modified matching problem, for which efficient algorithms can still be ob¬ 
tained. 

5.1. Inexact matching with the divergence-metric. 

The modification resides in exchanging the information metric for the degenerate divergence- 
metric given in equation (4). The degeneracy of the divergence-metric is characterized 
by 

div(Go(^-i) = 0 ^ G^^^(G,G) = 0, 

so the kernel is given by the tangent directions of the fibres of the principal bundle ( 12 ), 
explained in § 2.4. As mentioned in the introduction, the divergence-metric descends to the 
Fisher-Rao metric. If ddiv denotes the distance function of G^^^, we have 

(idiv(id, = di?(vol, (/?*vol) = (ii?((/?*vol, vol). (20) 

In consequence, the inexact density matching problem with G*^^^ and G^ is the following. 

Problem 6 (Inexact matching with divergence-metric). Given /i 0 :/^i ^ Dens(M), find 
ip G Diff(M) minimizing 

E{ip) = E{ip;jiio,jiii) = ((/:?*vol, vol) -|- ((/:?*/io,/ii), (21) 

where a > 0 is a regularization parameter, penalizing change of volume. 

Notice that we allow the source and target densities to belong to the completion Dens(M). 
This relaxation is possible because of equation (20) and the fact that the action of Diff(M) 
extends naturally from Dens(M) to Dens(M). For applications of Problem 6 the relaxation is 
important, as it allows us to treat images as densities. This only works for inexact matching, 
since Moser’s lemma requires strictly positive densities. 
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Due to the degeneracy of the divergence-metric, a solution to Problem 6 is not unique. 
Indeed, with 


Diffvoi,Mo(^) = Diff,oi(M) nDiff^,(M), 

Diffvoi,Mi(M) = Diffvoi(M) nDiff^,(M), 

we have the following result. 

Lemma 5.1. Let ip e Diff(M), r]o G Diffvoi,Aio(^)^ Vi ^ Diffvoi,/xi (^)- Then 

E{rii oip) = E{ip) = E{ip o 7?o). 

Proof. Since rji G DifFvoi,/ii (■^) we also have that G DifFvoi,/ii(-^)) since the intersection 
of two groups is again a group. Using the invariance of the Fisher-Rao metric we get 

E{r]i ocp) = ad^p{{rji)^(p^Yo\,Yo\) + ((r/i)*(/:?*/io,/ii) 

= crd|((/9*vol, r]^Yo\) H- ViLi) = T{ip). 

Again, using the invariance of the Fisher-Rao metric we get 

E{(por]Q) = ((/:?* (7/o)*vol,vol) + ((/:?* (7?o)*/^o, A^i) 

= crd|^((/:?*vol, vol) + /ii) = E{ip). 


This proves the lemma. □ 

Thus, the functional E has two different descending properties: 

1. It descends to a functional on the right cosets Diffvoi,^^! (A/’)\Diff(M), right-invariant 
with respect to Diffvoi,Ato(^)- The corresponding right principal bundle is 

Diffvoi,^! (M) c-^ Diff (M) (22) 

I 

2. It descends to a functional on the left cosets Diff (M)/Diffvoi,Aio(T ^)5 left-invariant with 
respect to Diffvoi,^i(AT). The corresponding left principal bundle is 

Diff vol, MO (AT) ^^ Diff (AT) (23) 

I 

DifF(M)/Diffvoi,^o(M). 

We need a strategy to tackle the degeneracy problem explained in Lemma 5.1. Our approach 
is simple: we impose the additional constraint on ip that it should be connected to the identity 
by a curve that is G^-orthogonal to the fibres of both principle bundles (22) and (23). Then 
Problem 6 can be solved efficiently by a gradient flow, as we now explain. 
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5.2. Gradient flow for inexact matching 

Let E denote the gradient of E with respect to the information metric . Our approach 
for Problem 6, i.e., for minimizing the functional in (21), is to discretize the gradient flow 

^ (24) 

Since the functional E is constant along the fibres of the principal bundles (22) and (23), 
the curve traced out by the gradient flow is G^-orthogonal to both fibres, as desired. 

Through formulae (9) and (21) we obtain an explicit formula for E. We can then derive the 
gradient E. It is convenient to carry out the calculations using the sphere representation 
for the densities via the W-map (7). 

Proposition 5.2. The -gradient of the divergenee-metrie matehing funetional E in (21) 
is given by 


V^E{(p) = A ^ J ^\D(p-^\Yo\^ grad + 

c( J W^WivolJ (Wi grad grad o , (25) 

where A is the inertia operator (11); Wi := := IT((^*/io); and 

arccos ( 


c: [0,1] ^ M, c{x) 


vol(M) > 


1 - 


ro\{My 


Proof. Take a curve ip{e) such that (/?(0) = cp, i.e., a variation of cp. Then 


de 


e=0 


E{^{e))=Gl{VGE,m) = 

O' ^^(<^*vol,vol), ^_^<^(e)*Mo ) • (26) 


We now write the variation of the form (p{e) = v o (p(e) for some vector field v G X(M). For 
any /i G Dens(M) we then have 


_d 

de 


e=0 


(p(e)^ja = -Cyip^ii. 


Let r := ja{M) = Then, from (30) 

d|^(/i, z/) = arccos J W{/a)W{ u)yo^ , 
so the variational derivative is given by 

^-V-" 

c(W(^x)W(u)) 
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to|^ 


Since ^et 

(If 

Notice that (i) F(/i, /i) = 0 since the variation 77 preserves the total volume, (ii) u) = 

reflecting the invariance of the Fisher-Rao distance, (hi) lim^^oo c{W {jii)W (z/)) = 
reflecting the simplified formula when the volume is infinite, and (iv) c{W{cp^YoVjW (vol)) = 
c{^y\Dip~^\). The first term in (26) now becomes 


{y/\D(p 1|) vol, vol), ^_^99(e)*vol 

ac{^/\Dip-^\) / F((/?*vol,vol)>Cv(/:?*vol = 

Jm 

ac{^/\Dip-^\) / F((/?*vol,vol)di^(/:?*vol = 

Jm 

— ac{\/\D(p~^\) / dF((/?*vol, vol) A iT((/?*vol)^i^vol = 

Jm 

- »c(v/iD^) d a H'(v.vo1)^1„vo1 = 

ac{^y\D(p-^\) / diT((/?*vol) A i^vol = 

Jm 

CFc{^/\D(p-^\) [ g{AA~^ gTdid{W{ip^YO\)),v)YO\ = 

Jm 

G(d (o-c(vtD^T)^“Arad(VF(</?*vol)), 93 = 

(o-c(v/|-D<y?“|)-4"^grad v/|-D(^-i| . (27) 
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Using the notation := W{ip^fio) and Wi := the second term in (26) becomes 


Sd , ,d .. \ 


c{W^Wi) / F{ip^jiio,fii)jCyip^jiio = 

Jm 

c{W^Wi) / F{(p^jiio,fii)diyip^jiio = 

JM 

- c{W^Wi) [ dF{(p^jJo, jji) A tU^Uvol = 

JM 

c{W^Wi) [ {WiAW^-W^AWi)M^^o\ = 

Jm 

[ g {AA-\Wi grad grad Wi), v) vol = 

JM 

{c{W^Wi)A-^{Wi grad grad Wi) oip,ip). (28) 


Put together, (27) and (28) proves the formula (25). 


□ 


Based on Proposition 5.2, we can now discretize the gradient flow (24). Indeed, a numerical 
method is given by the following algorithm. 


Algorithm 2. Assume we have a numerical way to represent functions, vector fields, and 
diffeomorphisms on M, and numerical methods for (i) composing diffeomorphisms with func¬ 
tions, (ii) composing diffeomorphisms with diffeomorphisms, (iii) computing the divergence 
of a vector field, (vi) computing the gradient of a functions, and (v) computing the inverse 
of the inertia operator A in (11). A computational algorithm for the gradient flow (24) is 
then given as follows. 

1. Choose a stepsize 5 > 0 and initialize the diffeomorphisms ipo = id, = id and a 
function Jq = 1. Precompute W{jJi) and grad(lU(/ii)). Set k ^ 0. 


2. Compute action on the source 


fk = Wifio) 

3. Compute coefficients = c( y^vol) and = c( //clU(/ii)vol), then compute 
momentum 


nik = (Tajkgrad(y4) + bkW(fii) giad fk - grad(VF(/xi)). 
(If vol(M) = oc, set a/c = 5/c = 1.) 

4. Compute the vector field infinitesimally generating the negative gradient 

Vk = -A~^mk. 
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5. Construct an approximation to ex.p{—evk)^ for example = id — svk- 

6. Update the inverse diffeomorphism: o . 

7. Update the inverse Jacobian using Lie-Trotter splitting^ 

J,+i = 

8. Construct an approximation ^pk to ex.-p{evk), for example ipk = idevk- (output) 

9. Update the forward diffeomorphism: ifk+i = 'ipk o (pj.. (output) 

10. Set k ^ k 1 and continue from step 2. 

5.3. Geometry of the gradient flow 

In this section we describe the geometry associated with the divergence-metric functional E 
in equation (21) and the corresponding gradient flow (24). 

The diffeomorphism group acts diagonally on Dens(M) xDens(M) by i^) = 

The isotropy group of (/r, u) G Dens(M) x Dens(M), i.e., the subgroup of Diff(M) that leaves 
(/i, u) invariant, is given by 


Diff^,4M) := Diff^(M) n Diff,(M). 

The action of Diff(M) on Dens(M) x Dens(M) is not transitive, so there is more than one 
group orbit. The group orbit through (vol,/io), given by 

Orb(vol, /io) •= Diff(M) • (vol, /tq) C Dens(M) x Dens(M), 

is a way to represent the quotient set Diff(M)/Diffvoi,^o(^)* This set is potentially com¬ 
plicated (an orbifold), but let us assume we stay away from singular points so we can work 
with Orb(vol,/io) as a submanifold of Dens(M) x Dens(M). The principal bundle (23) can 
then be represented as 

Diffvoi,^o (M) ^^ Diff (M) (29) 

Orb(vol,/io), 

where 

7rvol,Mo(^) = ^ • (vol,/io) = ((/9*vol,(^*/io). 

The manifold Dens(M) x Dens(M) comes with a metric, namely 

(Notice that is naturally extended to a metric on Dens(M) by the lU-map (7).) The 
corresponding distance is given by 

(M2,i^2)) = (T(Pp{ 111 , 112 ) + 3\r{vi,V2). (30) 

^Here one could use the Strang splitting to obtain higher order. 
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Figure 1: Illustration of the geometry associated with inexact density matching using the 
divergence-metric. The gradient flow on Diff(M) descends to a gradient flow on 
the orbit Orb(vol,/io)- While constrained to Orb(vol,/io) C Dens(M) x Dens(M), 
this flow strives to minimize the ambient Fisher-Rao distance dpF to (vol,/ii). 


To connect back to the divergence-metric matching in Problem 6, the key observation is 
that 

E{ip) = d^pp{{vol,IIi), (vol, Ho) • (p) = (Pppiivo\,Hl),TTvol,no{‘P))^ 

so E{(p) is simply the function (/i, d|.^((vol,/ii), on Dens(M) x Dens(M) lifted 

to Diff(M). 

Let us now discuss how the metric fits into this geometry. 

Lemma 5.3. The information metric on Diff(M) is descending with respect to the 
principal bundle (29), i.e., it descends to a metric G^rb Orb(vol,/io)* 

Proof. Translation by e Diffvoi,^o (^) ^ vector U G T^Diff(M) along the fibre of (29) 

is given by U ^ U o Therefore, the metric G^ is descending if and only if 

G^(C/V) = G^o^(C^°V’V°^), (31) 

where is the distribution that is G^-orthogonal to the tangent spaces of the fibres 

of (29). Since G^ is right-invariant, condition (31) is automatically true. □ 

Although G^ descends to a metric G^^^, the associated distance function Jorb is not ex¬ 
plicitly computable, in particular it is not given by dpF- If it was, the result in Lemma 5.3 
would allow us to use the same technique as in § 3 to solve the non-compatible, non¬ 
degenerate matching problem using G^ and G^, namely to lift the geodesics on Orb(vol, /io)* 
Our remedy is to exchange dorb for the distance function dFF on the ambient manifold 
Dens(M) x Dens(M). The geometry of our set-up is illustrated in Figure 1. 
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5.4. Two-component gradient flow 

Since both the information metric and the functional E descend with respect to (29), 
the gradient flow (24) induces a gradient flow on Orb(vol, /io) C Dens(M) x Dens(M). This 
allows us to represent the gradient flow on Dens(M) x Dens(M). 

Proposition 5.4. The gradient flow (24) deseends to a two-eomponent gradient flow equa¬ 
tion on Dens(M) x Dens(M), eonstrained to stay on Orb(vol, /io)- Expressed in the variables 
J = \D(p~^\ and P = ((/:?*/io)/vol it is given by 


j = —V ' grad J — J div(i;) 
P = —V • grad P — P div(i;) 


(32) 


with 


V = A ^ ^crc ^ J ^/j vol^ grad \/~J + 

c (J W^(/ii)VPvol)(w"(/Ui) grad VP-Vp gradW"(/Ui)) 

Proof. Let (p{t) be the solution of the gradient flow (24). Then ip{t) o ip{t)~^ = V^E{ip{t)) o 
(/:?(t)“^ =: v{t) depends on ip{t) only through (p{t)^Yol and (/:?(t)*/io, i.e., through J and P. 
We also have 


dt 


ip{t)^Yo\ = -£^(t)(/:?(t)*vol = - {Cy(t)J + div(i;(t))J) vol 


and 


—= - {£v{t)P + div(v(t))P) vol. 


This proves the result. 


□ 


This proposition tells us that an alternative way to compute the gradient flow (24) is to 
solve first (32) and then the lifting equations for the principle bundle (29), thereby obtaining 
a horizontal curve in Diff(M). We have not investigated this approach in detail. 


5.5. Relation to matching with compatible background metric 

In this section we want to show the relation between the compatible background approach 
in § 3 and the gradient flow approach developed here. Recall that the solutions to Problem 4 
and Problem 6 are obtained by the inverse of the endpoint of a horizontal geodesic, obtained 
by lifting a Fisher-Rao geodesic. As a consequence, we have the following two results. 

Lemma 5.5. Let /xq = vol,/ii G Dens(M) and let [0,1] i-^ j{t) G Diff(M) be the horizontal 
-geodesie sueh that 7 ( 1 )“^ solves the OIT problem (Problem f). Then VL^( 7 (t);/ii, vol) 
is parallel with 7 (t). 

Proof. We have E{(p; /ii, vol) = cr(i|.((/9*vol, vol)+(i|.((/?*vol,/ii). Thus, L^(-; /ii, vol) descends 
with respect to Moser’s principal bundle (12). Since the information metric G^ descends to 
the Fisher-Rao metric, the gradient flow descends to a gradient flow on Dens(M), given by 

A =-V^e(/u), e{iJ,) = aXp{vo\, ij) + (Ppiii, iJi), (33) 


26 





where is the gradient with respect to . If now /i = 7r(7(t)), then V^e(/i) is par¬ 
allel with ^7r(7(t)), since 7r(7(t)) is the unique minimizing geodesic between vol and fii. 
The result now follows since two horizontally lifted paths are parallel if they are parallel 
on Dens(M). □ 

Proposition 5.6. If /io = vol and /ii G Dens(M)^ then the limit of the gradient flow 
if = —\/aE{(p; /ii, vol); (/^(O) = id exists and eoineides with the inverse of the solution to the 
non-exaet OIT problem (Problem 5). 

Proof Follows since the gradient flow ip = —aE{p; pi, vol) descends to the gradient 
flow (33) and e{p) is a strictly convex functional and Dens(M) is a convex space with 
respect to the Fisher-Rao geometry. □ 

Remark 5.7. In contrast to the previous parts of this section we actually require here that 
Pi is strictly positive. This is indeed a necessary condition: for pi on the boundary of 
Dens(M) we cannot guarantee the existence of minimizer to the non-exact optimal informa¬ 
tion transport problem. In fact for a target density pi G Dens(M) the optimal deformation 
p will in general not be a diffeomorphisms, instead it will have a vanishing derivative on 
certain points or even intervals. To guarantee the existence of minimizers also in this situa¬ 
tion, one could use a complete metric on the diffeomorphism group as regularization term. 
Possible choices for this include higher order Sobolev metrics [12, 5, 8, 23] and metrics that 
are induced by Gaussian kernels [33]. 

6. Examples of matching with non-compatible metric 

In this section we evaluate the gradient flow based Algorithm 2 in various examples where 
Po, Pi G Dens(T^), i.e., there are regions with vanishing density (represented by black pixels). 

For simplicity, in our experiments we consider the flat, periodic torus and use fast fouler 
transform for inverting the operator A in (11). There has been extensive work on fast, 
efficient solution of Poisson’s problem on other manifolds. See, for example, the review [11]. 

6.1. Registration of ietters: J to V 

This example illustrates the capability to produce severely deformed warps, namely to warp 
the letter J into the letter V. We also examine the effect of the balancing parameter a. 

With step-size 5 = 0.2, balance a = 0.05, and 400 iterations, we get the following energy 
evolution and sequence of warps. 
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Source Final Target 


Notice in the warp that the pixel-values has changed in regions of large deformations (the 
upper left part of the final V). This is due to expansion. The corresponding mesh deformation 
and Jacobian of the inverse at the final point look as follows. 


Inverse warp Inverse Jacobian \Dip 



Notice how the lower part of the J is stretched out to the left part of the V. 

Let us now do the same experiment but with the larger balancing parameter a = 5. 




Source Final Target 


Notice here that the warp rarely changes the pixel-values, at the price of less expansion of 
the left part of the V. This is due a smaller amount of compression and expansion, as seen 
below in the corresponding mesh deformation and Jacobian of the inverse. 
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Compared to the smaller a, the Jacobian determinant is more regular and closer to one 
and the mesh deformation is almost volume-preserving. This example illustrates nicely the 
influence of the balancing parameter. 


6.2. Registration of noisy x-ray images 

In this example we illustrate the use of Algorithm 2 for registration of low-resolution, noisy 
x-ray images of human hands. With step-size 5 = 0.2, balance a = 0.1, and 400 iterations, 
we get the following energy evolution and sequence of warps. 




Source Final Target 


Except for the tip of the little finger and the thumb, the resulting path of diffeomorphisms 
yields a good warp between corresponding bone structures in the hands. The mesh defor¬ 
mation and Jacobian of the inverse at the final point look as follows. 
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Inverse warp 


Inverse Jacobian \D(p 




Here one can see how the noise affect the diffeomorphisms: both the mesh warp and the 
Jacobian are somewhat irregular, except at the left and right borders where the source and 
target densities vanish. 

A. Constructing compatible background metrics 

In § 3 we have described a method to solve the density matching problem, assuming that 
we are given a compatible background metric. If the source density /io is not equal to 
the density induced by the background metric, one has to construct such a metric first (as 
in §4.2). There is, of course, a range of background metrics h having a prescribed volume 
form; here we describe two specific choices. 

A.1. Conformal metric 

As already discussed, any volume form /i G Dens(M) can be written /i = Jvolg. Note that I 
is the Radon-Nikodym derivative of /i with respect to volg (as measures). This observation 
yields a first choice for the desired metric. 

Lemma A.l. Let jj. = Jvolg G Dens(M). Then the modified metrie 

h = lig (34) 


is eompatihle with jn, i.e., volh = ii. 

Proof. In coordinates (x^,..., x^) we have 

volh = y^det(h^ dx^ A ... A dx^ . □ 

Remark A. 2 . Note that the metric (34) is conformally equivalent to the background metric 
g. In mechanics, it is called the Jaeobi metrie. 

The advantage of the metric (34) is that it is easy to construct and also that the Laplacian 
and gradient take simple forms (in terms of the Laplacian and gradient of the original 
background metric g.) 
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Lemma A.3. The Laplacian and gradient of the metrie h = / 2 g are given by 


gradh(/) = J S gradg(/), Ah/ = j diVg(/^ 2 gradg(/)) 
Proof To calculate the expression for the gradient we use 

dfiX) = h(gradh(/),X) = g(/t gradh(/),X) = g(gradg(/),X) 
For the Laplacian we express the divergence in coordinates 

divg(X) = V) = \di{IX^) = t divg(JX). 


Then we have 

Ah/ = divh(gradh(/)) = divh(/“T gradg(/)) = j diVg(/ ■ /“t gradg(/)) 

= j divg (/^ “ t gradg (/)) □ 

Remark A.4. Note that for n = 2 the formula for the Laplacian simplifies to 



reflecting the scale invariance of the Laplacian in dimension two. 
Corollary A.5. Using the metrie (34) the lifting equation (17) reads 

(p{t) = v{t) O = id, 

v{t) = gradg (/), 

i diVg(C-t gradg(/)) = o 


A.2. Constructing a flat compatible metric 

For a flat background metric g, the Jacobi metric h given by (34) is not, in general, flat. 
Indeed, h is only fiat if I is constant. Now we will describe a method to choose a flat 
background metric assuming that the background metric g is flat. 

Lemma A.6. Assume that M earries a flat baekground metrie g. Let /i G Dens(M) and 
let ip be the solution of Problem 4 '^ith //q = vol and pi = p. Then h = ip^'g is a flat metrie 
and volh = p. 

Proof We have vol^^^h = ^*volh for any metric h and therefore h = cp'^g has the desired 
volume density. The flatness follows since the curvature tensor IZg of g satisfies 0 = IZg = 
ip I—I 
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A.3. Symmetric matching 

In the previous section we have described how to choose a metric that is compatible with a 
fixed volume form. Thus, the solution of Problem 1 with respect to the Fisher-Rao metric 
(5) can be obtained as follows: 

1. Given /io, ^ Dens(M), construct a background metric h compatible with /io by 
§ A.l or § A.2. 

2. Solve the lifting equation (17) with respect to h, using the explicit geodesic curve (8) 
between /io and /ii. 

This problem is not symmetric in /io and /ii, since the metric h depends on /io. However, 
the condition that h is compatible with /io can be made more general, which will allow us 
to derive a symmetric solution. We use the fact that any geodesic that is horizontal at 
some s G (0,1) remains horizontal for all time. Thus we do not have to choose a metric 
that is compatible with one of the prescribed densities /io and /ii, but only a metric that is 
compatible with some density that lies on the geodesic connecting the two. This suggests the 
following choice of background metric h, which leads to a symmetric solution of Problem 1: 

1. Calculate the geodesic curve /i(t) connecting /io to /ii using (8). 

2. Construct a background metric h that is compatible with the midpoint of the geodesic 
/i(|) using § A.l or § A.2. 

3. Solve the lifting equations for the part of the geodesic that connects the densities /i(|) 
and /ii with respect to the compatible metric h. Denote the resulting deformation (p. 

4. Solve the lifting equations for the part of the geodesic that connects the densities /i(^) 
and /io with respect to the compatible metric h. Denote the resulting deformation 

5. The solution is then given by (p o 

The symmetry of this strategy follows from the construction. 


B. Relation between Fisher-Rao and other distances 

In this section we compare the Fisher-Rao distance with the Wasserstein distance 

•), the Hellinger distance d^(-, •), the total variation distance •), the Kullback- 

Leibler divergence •) and the y^-distance d^(-, •) (see [15] for definitions). Note that the 

Kullback-Leibler divergence and the y^-distance are not metrics, as they are not symmetric. 
The following inequalities for probability distances are given by Gibbs and Su [15]. 

sup(%°lx!y)) ^ < inf {d^{x,y)) 

2 . 

3. 

4. 
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Here, is the distance on M. The lower bound in item 1 is meaningful only for bounded 
M and the upper bound only for discrete M. Using these inequalities, and a comparison of 
the Fisher-Rao distance with the Hellinger distance, we obtain the following result. 

Proposition B.l. For any two densities jui G Dens(M) we have 

Jp(d^°(x!y)) ^ < f inf {d^{x,y)) 

7. (r^{iio,yi) < d™(/io,Mi) < 

8 . 

9. 

Proof. To prove the first inequality we recall that equals the spherical Hellinger 

distance, see § 2.2. Thus we only have to compare the spherical distance to the Euclidean 
distance, yielding a factor Now the other inequalities follow immediately using item 1 to 
item 4. □ 

Recall from §2 that the Fisher-Rao metric arises as projection from a metric on Diff(M). 
Likewise, there is a metric on Dens(M) corresponding to the Wasserstein distance that is 
the projection of a (non-right invariant) metric on the diffeomorphism group [29]. To our 
knowledge, it is not known whether similar statements holds for the other distance functions. 
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